We conducted an exploratory data analysis of the Covid-19 data for Ohio in 2021. First, we take a glimpse of the data:
Rows: 1,175,665
Columns: 30
$ id <chr> "0007cb93", "0007cb93", "0007cb93"…
$ date <date> 2021-01-01, 2021-01-02, 2021-01-0…
$ confirmed <dbl> 236, 239, 240, 241, 243, 245, 247,…
$ deaths <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ people_vaccinated <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ people_fully_vaccinated <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ school_closing <dbl> -2, -2, -2, -2, -2, -2, -2, -2, -2…
$ workplace_closing <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ cancel_events <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ gatherings_restrictions <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
$ transport_closing <dbl> -1, -1, -1, -1, -1, -1, -1, -1, -1…
$ stay_home_restrictions <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ internal_movement_restrictions <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ international_movement_restrictions <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
$ information_campaigns <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ testing_policy <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
$ contact_tracing <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ facial_coverings <dbl> -2, -2, -2, -2, -2, -2, -2, -2, -2…
$ vaccination_policy <dbl> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2…
$ elderly_people_protection <dbl> -2, -2, -2, -2, -2, -2, -2, -2, -2…
$ government_response_index <dbl> -57.45, -57.45, -57.45, -57.45, -5…
$ stringency_index <dbl> -54.63, -54.63, -54.63, -54.63, -5…
$ containment_health_index <dbl> -54.94, -54.94, -54.94, -54.94, -5…
$ economic_support_index <dbl> -75, -75, -75, -75, -75, -75, -75,…
$ administrative_area_level_1 <chr> "United States", "United States", …
$ administrative_area_level_2 <chr> "Georgia", "Georgia", "Georgia", "…
$ administrative_area_level_3 <chr> "Schley", "Schley", "Schley", "Sch…
$ latitude <dbl> 32.29202, 32.29202, 32.29202, 32.2…
$ longitude <dbl> -84.31604, -84.31604, -84.31604, -…
$ population <dbl> 5257, 5257, 5257, 5257, 5257, 5257…
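The first rows of the glimpse belong to Schley County, Georgia, so the table evidently covers more than Ohio. Below is a minimal sketch of how the Ohio subset for 2021 might be selected, written in Python for illustration. The column names come from the glimpse. The helper `ohio_2021` is hypothetical, and the dates attached to the two Ohio rows are assumptions.

```python
from datetime import date

# Illustrative rows only. Column names follow the glimpse; the dates on the
# two Ohio rows (first and last day of 2021) are assumptions.
rows = [
    {"administrative_area_level_2": "Georgia",
     "administrative_area_level_3": "Schley",
     "date": date(2021, 1, 1), "confirmed": 236},
    {"administrative_area_level_2": "Ohio",
     "administrative_area_level_3": "Franklin",
     "date": date(2021, 1, 1), "confirmed": 84050},
    {"administrative_area_level_2": "Ohio",
     "administrative_area_level_3": "Cuyahoga",
     "date": date(2021, 12, 31), "confirmed": 227664},
]


def ohio_2021(records):
    """Keep rows whose state is Ohio and whose date falls in 2021."""
    return [
        r for r in records
        if r["administrative_area_level_2"] == "Ohio" and r["date"].year == 2021
    ]


ohio = ohio_2021(rows)
print([r["administrative_area_level_3"] for r in ohio])
```

Filtering on `administrative_area_level_2` rather than on `id` keeps the selection readable, at the cost of relying on the state name being spelled consistently.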
We examine the outliers in the distribution of cumulative confirmed cases to identify which counties had the highest cumulative case counts at the beginning and at the end of 2021.
The highest cumulative count at the beginning of the year was in Franklin County; at the end of the year it was in Cuyahoga County. In the table below, the Franklin row reflects the beginning of the year and the Cuyahoga row the end of the year.
| County | Cumulative Cases | Cumulative Deaths | No. of Vaccinated | No. of Fully Vaccinated | Population |
|---|---|---|---|---|---|
| Franklin | 84050 | 698 | 0 | 0 | 1316756 |
| Cuyahoga | 227664 | 2828 | 819155 | 757138 | 1235072 |
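The lookup behind this table can be sketched as follows, in Python for illustration. This is not necessarily how the analysis was run. The helper `top_county_at` is hypothetical, and the exact dates (1 January and 31 December 2021) are assumptions. The case counts are the ones reported in this section; end-of-year rows for other counties are omitted because the text does not report them.

```python
from datetime import date

# Counts are taken from the tables in this section; the dates are assumed.
records = [
    {"county": "Franklin", "date": date(2021, 1, 1), "confirmed": 84050},
    {"county": "Cuyahoga", "date": date(2021, 1, 1), "confirmed": 70137},
    {"county": "Vinton", "date": date(2021, 1, 1), "confirmed": 513},
    {"county": "Harrison", "date": date(2021, 1, 1), "confirmed": 692},
    {"county": "Cuyahoga", "date": date(2021, 12, 31), "confirmed": 227664},
]


def top_county_at(rows, day, key="confirmed"):
    """Return the row with the largest value of `key` among rows dated `day`."""
    on_day = [r for r in rows if r["date"] == day]
    return max(on_day, key=lambda r: r[key])


first_day = min(r["date"] for r in records)
last_day = max(r["date"] for r in records)

print(top_county_at(records, first_day)["county"])  # Franklin
print(top_county_at(records, last_day)["county"])   # Cuyahoga
```

Because `confirmed` is cumulative, comparing counties on a single date is enough; no summation over the year is needed.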
We then find the minimum cumulative number of confirmed cases and the minimum cumulative number of deaths at the beginning of the year, along with the counties where they occurred: Vinton had the fewest confirmed cases and Harrison the fewest deaths.
| County | Population | Confirmed | Deaths |
|---|---|---|---|
| Vinton | 13085 | 513 | 11 |
| Harrison | 15040 | 692 | 3 |
Likewise, we find the maximum cumulative number of confirmed cases and the maximum cumulative number of deaths at the beginning of the year, along with the counties where they occurred: Franklin had the most confirmed cases and Cuyahoga the most deaths.
| County | Population | Confirmed | Deaths |
|---|---|---|---|
| Franklin | 1316756 | 84050 | 698 |
| Cuyahoga | 1235072 | 70137 | 897 |
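The minimum and maximum lookups can be expressed with one small helper. The sketch below is in Python and uses only the four beginning-of-year rows shown in the two tables; the function name `extremes` is hypothetical. In the full Ohio data the same call would run over every county.

```python
# Beginning-of-year values copied from the minimum and maximum tables.
counties = [
    {"county": "Vinton", "population": 13085, "confirmed": 513, "deaths": 11},
    {"county": "Harrison", "population": 15040, "confirmed": 692, "deaths": 3},
    {"county": "Franklin", "population": 1316756, "confirmed": 84050, "deaths": 698},
    {"county": "Cuyahoga", "population": 1235072, "confirmed": 70137, "deaths": 897},
]


def extremes(rows, column):
    """Return (county with the smallest value, county with the largest value)."""
    lowest = min(rows, key=lambda r: r[column])
    highest = max(rows, key=lambda r: r[column])
    return lowest["county"], highest["county"]


print(extremes(counties, "confirmed"))  # ('Vinton', 'Franklin')
print(extremes(counties, "deaths"))     # ('Harrison', 'Cuyahoga')
```

The extremes for cases and for deaths fall in different counties, which is why each table lists two rows rather than one.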